Meta-Learning in the Area of Data Mining
Kučera, Petr ; Hlosta, Martin (referee) ; Bartík, Vladimír (advisor)
This paper describes the use of meta-learning in data mining. It outlines the data mining problems and tasks where meta-learning can be applied, with a focus on classification, and gives an overview of meta-learning techniques and their possible applications in data mining, especially model selection. It then describes the design and implementation of a meta-learning system that supports classification tasks in data mining. The system uses statistics and information theory to characterize the data sets stored in the meta-knowledge base. A meta-classifier is built from this base and predicts the most suitable model for a new data set. The conclusion discusses the results of experiments with more than 20 data sets representing classification tasks from different areas and suggests possible extensions of the project.
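As a rough illustration of what characterizing a data set with statistics and information theory can look like, the following Python sketch computes a handful of common meta-features. The function names and the particular measures (class entropy, mean absolute skewness, simple counts) are assumptions chosen for illustration; the abstract does not say which measures the thesis actually uses.

```python
import math
from collections import Counter
from statistics import mean, pstdev


def class_entropy(labels):
    """Shannon entropy (in bits) of the class label distribution."""
    counts = Counter(labels)
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def skewness(values):
    """Population skewness of one numeric attribute (0.0 if it is constant)."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return 0.0
    return mean(((v - mu) / sigma) ** 3 for v in values)


def meta_features(rows, labels):
    """Describe a data set by a few hypothetical meta-features.

    rows   -- list of equal-length tuples of numeric attribute values
    labels -- list of class labels, one per row
    """
    columns = list(zip(*rows))
    return {
        "n_instances": len(rows),
        "n_attributes": len(columns),
        "n_classes": len(set(labels)),
        "class_entropy": class_entropy(labels),
        "mean_abs_skewness": mean(abs(skewness(col)) for col in columns),
    }


# Tiny made-up data set: two numeric attributes, two balanced classes.
rows = [(1.0, 2.0), (2.0, 2.5), (3.0, 9.0), (4.0, 3.0)]
labels = ["a", "a", "b", "b"]
features = meta_features(rows, labels)
```

A vector like `features`, paired with the name of the model that performed best on that data set, would form one training example for a meta-classifier.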
|
|
Adaptive Matchmaking Algorithms for Computational Multi-Agent Systems
Kazík, Ondřej ; Neruda, Roman (advisor) ; Paprzycki, Marcin (referee) ; Diamantini, Claudia (referee)
Multi-agent systems (MAS) have proven their suitability for the implementation of complex software systems. In this work, we have analyzed and designed a data mining MAS by means of a role-based organizational model. The organizational model and the model of data mining methods have been formalized in description logic. By matchmaking, which is the main subject of our research, we understand the recommendation of computational agents, i.e. agents encapsulating some computational method, according to their capabilities and previous performance. Matchmaking thus consists of two parts: querying the ontology model and meta-learning. Three meta-learning scenarios were tested: optimization in the parameter space, multi-objective optimization of data mining processes, and method recommendation. A set of experiments in these areas has been performed.
|
|
Evolutionary optimization of machine learning workflows
Suchopárová, Gabriela ; Neruda, Roman (advisor) ; Pilát, Martin (referee)
This work deals with automated machine learning (AutoML), a field that aims to automate the process of model selection for a given machine learning problem. We have developed a system that, for a given supervised learning task represented by a dataset, finds a suitable pipeline: a combination of machine learning, ensemble, and preprocessing methods. For the search, we designed a special instance of developmental genetic programming, which enables us to encode directed acyclic graph pipelines into a tree representation. The system is implemented in the Python programming language and operates on top of the scikit-learn library. The performance of our solution was tested on 72 datasets of the OpenML-CC18 benchmark with very good results.
|
|
Estimating performance of classifiers from dataset properties
Todt, Michal ; Polák, Petr (advisor) ; Baruník, Jozef (referee)
This thesis explores the impact of a dataset's distributional properties on classification performance. We use Gaussian copulas to generate 1000 artificial datasets and train classifiers on them. We train generalized linear models, distributed random forests, extremely randomized trees, and gradient boosting machines via the H2O.ai machine learning platform accessed from R. Classification performance on these datasets is evaluated, and empirical observations on the influence of the distributional properties are presented. Secondly, we use a real Australian credit dataset and predict which classifier is likely to work best. The predicted performance for each individual method is based on penalizing the differences between the Australian dataset and the artificial datasets where the method performed comparatively better, but this approach failed to predict correctly.
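The idea of generating artificial data with a Gaussian copula can be sketched in a few lines of Python. This is only a minimal two-attribute illustration under stated assumptions: the function name, the correlation parameter `rho`, and the choice of marginals (exponential and uniform) are hypothetical and are not taken from the thesis, which used R and H2O.ai.

```python
import math
import random
from statistics import NormalDist


def gaussian_copula_sample(n, rho, seed=0):
    """Draw n pairs whose dependence follows a Gaussian copula with correlation rho.

    Assumed marginals (illustrative only): exponential with rate 1 for the
    first attribute, uniform on [0, 10] for the second.
    """
    rng = random.Random(seed)
    std_normal = NormalDist()  # standard normal, mean 0 and sigma 1
    samples = []
    for _ in range(n):
        # Step 1: correlated standard normal pair.
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        # Step 2: map to uniforms with the normal CDF (this is the copula).
        u1 = min(std_normal.cdf(z1), 1.0 - 1e-12)  # clamp to keep log finite
        u2 = std_normal.cdf(z2)
        # Step 3: apply the inverse CDF of each chosen marginal.
        x1 = -math.log(1.0 - u1)  # exponential(rate=1)
        x2 = 10.0 * u2            # uniform on [0, 10]
        samples.append((x1, x2))
    return samples


data = gaussian_copula_sample(1000, rho=0.8, seed=42)
```

Varying `rho` and the marginals changes the distributional properties of the generated data while keeping the sampling procedure the same, which is what makes copulas convenient for this kind of controlled experiment.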